1. Introduction

Secant approximations to finite dimensional matrices are used in many
computational algorithms. These approximations are matrices $A_+ \in R^{m \times n}$ that
satisfy a secant equation

$$A_+ s = y$$

for some $y \in R^m$ and $s \in R^n$. The most common applications, reviewed briefly
below, are in solving square or rectangular systems of nonlinear equations, and
in solving unconstrained and constrained optimization problems. In this paper
we consider more general approximations $A_+ \in R^{m \times n}$ that satisfy several secant
equations

$$A_+ S = Y$$

for some $S \in R^{n \times p}$ that has full column rank and $Y \in R^{m \times p}$, and the use of such
approximations in solving systems of nonlinear equations and unconstrained
optimization problems.

The most basic use of secant approximations is in quasi-Newton algorithms
for the square systems of nonlinear equations problem,

given $F : R^n \to R^n$, find $x_* \in R^n$ such that $F(x_*) = 0$.

These algorithms generate a sequence of iterates $\{x_k\}$, $k = 0, 1, \dots$, that
are increasingly good approximations to $x_*$. The $(k+1)$st iteration is based on an
affine model of $F(x)$ around $x_{k+1}$,

$$M_{k+1}(x) = F(x_{k+1}) + A_{k+1} (x - x_{k+1}), \qquad (1.1)$$

where $A_{k+1}$ is a secant approximation to $F'(x_{k+1})$ that obeys the secant
equation

$$A_{k+1} s_k = y_k, \qquad (1.2a)$$

where

$$s_k = x_{k+1} - x_k, \quad y_k = F(x_{k+1}) - F(x_k). \qquad (1.2b)$$

Equations (1.1-2) cause $M_{k+1}(x)$ to interpolate $F(x)$ at $x = x_k$ as well as at
$x = x_{k+1}$. Many matrices $A_{k+1} \in R^{n \times n}$ satisfy (1.2); the standard way to choose
$A_{k+1}$ is to update the previous approximation $A_k$ by Broyden's update

$$A_{k+1} = A_k + \frac{(y_k - A_k s_k) s_k^T}{s_k^T s_k} \qquad (1.3)$$

(Broyden [1965]). This update was shown by Dennis and Moré [1977] to be the
solution to

$$\text{minimize } \|A_{k+1} - A_k\|_F \text{ subject to } A_{k+1} s_k = y_k,$$

where $\|\cdot\|_F$ denotes the Frobenius norm,

$$\|A\|_F^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2.$$

That is, $A_{k+1}$ is the least change secant update to $A_k$. Broyden, Dennis, and Moré
[1973] showed that the sequence of iterates generated by the quasi-Newton
method

$$x_{k+1} = x_k - A_k^{-1} F(x_k)$$

with $\{A_k\}$ generated by (1.3) converges q-superlinearly to a root $x_*$ of $F(x)$
provided $x_0$ and $A_0$ are sufficiently close to $x_*$ and $F'(x_*)$, respectively, $F'(x_*)$ is
nonsingular, and $F'(x)$ is Lipschitz continuous in an open neighborhood
containing $x_*$. For further review of secant methods for nonlinear equations, see Dennis
and Moré [1977] or Dennis and Schnabel [1983].
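
The rank-one update (1.3) can be illustrated with a minimal sketch. The helper names `matvec` and `broyden_update` and the 2 x 2 data are assumptions made for illustration only; exact rational arithmetic is used so that the secant equation can be checked without rounding.

```python
from fractions import Fraction


def matvec(A, v):
    """Return the matrix-vector product A v for a list-of-rows matrix A."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def broyden_update(A, s, y):
    """Return A + (y - A s) s^T / (s^T s), the rank-one update (1.3)."""
    residual = [yi - asi for yi, asi in zip(y, matvec(A, s))]
    ss = sum(si * si for si in s)
    return [[a + r * sj / ss for a, sj in zip(row, s)]
            for row, r in zip(A, residual)]


# Illustrative data (hypothetical): A is the identity, s and y are arbitrary.
A = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
s = [Fraction(1), Fraction(2)]
y = [Fraction(3), Fraction(4)]
A_plus = broyden_update(A, s, y)
```

In this sketch `A_plus` maps `s` to `y`, while its action on any vector orthogonal to `s` is the same as that of `A`, which is the sense in which the update changes `A` as little as possible.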

In Section 2 we generalize all the results stated in the last paragraph to
methods where each approximation $A_{k+1}$ in the affine model (1.1) satisfies $p \le n$
secant equations

$$A_{k+1} S_k = Y_k \qquad (1.4)$$

for $S_k, Y_k \in R^{n \times p}$. The obvious choices of $S_k$ and $Y_k$ are

$$S_k e_j = x_{k+1} - x_{k+1-j}, \quad Y_k e_j = F(x_{k+1}) - F(x_{k+1-j}), \quad j = 1, \dots, p, \qquad (1.5)$$

where $e_j$ denotes the $j$th unit vector. If $A_{k+1}$ satisfies (1.4-5), then the affine
model (1.1) interpolates $F(x)$ at $x_{k+1-p}, \dots, x_{k+1}$. In Section 2 we give the
generalization of Broyden's update that satisfies (1.4) and show that it is the least
change update satisfying these equations. We also give conditions on $\{S_k\}$ and
$\{Y_k\}$ under which the quasi-Newton method using this generalized Broyden's
update is locally q-superlinearly convergent. The material in Section 2 is only a
modest generalization of Gay and Schnabel [1978]; it is included because the
proofs are simpler and clearer, and to motivate the material in Section 3.

The other problem considered in this paper is the unconstrained minimization
problem,

$$\text{minimize } f(x) : R^n \to R. \qquad (1.6)$$

The first order necessary condition for $x_*$ to be a solution of (1.6) is $\nabla f(x_*) = 0$,
so (1.6) can be considered a special case of the nonlinear equations problem
where $F(x) = \nabla f(x)$. While this viewpoint has limitations, it is useful in
motivating secant methods for unconstrained minimization. In particular, secant
methods for (1.6) base the $(k+1)$st iteration on a model $m_{k+1}(x)$ of $f(x)$ around $x_{k+1}$,

$$m_{k+1}(x) = f(x_{k+1}) + \nabla f(x_{k+1})^T (x - x_{k+1}) + \tfrac{1}{2} (x - x_{k+1})^T H_{k+1} (x - x_{k+1}),$$

where $H_{k+1}$ is an approximation to $\nabla^2 f(x_{k+1})$. If

$$H_{k+1} s_k = y_k, \qquad (1.7)$$

where

$$s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k),$$

then $\nabla m_{k+1}(x)$ interpolates $\nabla f(x)$ at $x_k$ and $x_{k+1}$. The major difference between
secant methods for nonlinear equations and unconstrained minimization is that
in unconstrained minimization $\nabla^2 f(x)$ is symmetric, so the approximations $\{H_k\}$
should be too.


Powell [1970] introduced a symmetrized version of Broyden's update that
satisfies (1.7),

$$H_{k+1} = H_k + \frac{(y_k - H_k s_k) s_k^T + s_k (y_k - H_k s_k)^T}{s_k^T s_k} - \frac{\left( (y_k - H_k s_k)^T s_k \right) s_k s_k^T}{(s_k^T s_k)^2}, \qquad (1.8)$$

and this update is known as the Powell symmetric Broyden (PSB) update.
Dennis and Moré showed that (1.8) is the solution to

$$\text{minimize } \|H_{k+1} - H_k\|_F \text{ subject to } H_{k+1} \text{ symmetric}, \ H_{k+1} s_k = y_k, \qquad (1.9)$$

provided that $H_k$ is symmetric; that is, (1.8) is the least change symmetric
secant update to $H_k$. Broyden, Dennis and Moré [1973] showed that the
sequence of iterates generated by the quasi-Newton method

$$x_{k+1} = x_k - H_k^{-1} \nabla f(x_k) \qquad (1.10)$$

with $\{H_k\}$ generated by (1.8) is locally q-superlinearly convergent to a minimizer
$x_*$ of $f(x)$ under appropriate assumptions.

Two other symmetric secant approximations to $\nabla^2 f(x)$, however, have been
more successful in practice. They are the BFGS and DFP updates. The BFGS
update, named after its proposers Broyden [1970], Fletcher [1970], Goldfarb
[1970], and Shanno [1970], is

$$H_{k+1} = H_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{H_k s_k s_k^T H_k}{s_k^T H_k s_k}. \qquad (1.12)$$

The DFP update, named after its originators Davidon [1959] and Fletcher and
Powell [1963], is

$$H_{k+1} = H_k + \frac{(y_k - H_k s_k) y_k^T + y_k (y_k - H_k s_k)^T}{y_k^T s_k} - \frac{\left( (y_k - H_k s_k)^T s_k \right) y_k y_k^T}{(y_k^T s_k)^2}. \qquad (1.13)$$

Both updates obey (1.7), and have the additional desirable property that if $H_k$ is
symmetric and positive definite and

$$y_k^T s_k > 0, \qquad (1.14)$$

then $H_{k+1}$ is well-defined, symmetric and positive definite. In practice, $H_0$ is
chosen symmetric and positive definite and (1.14) is enforced by the line search,
so each $H_k$ is symmetric and positive definite. Dennis and Moré [1977] showed
that both the BFGS and DFP updates are least change symmetric secant updates
in an appropriate weighted Frobenius norm, provided that $H_k$ is symmetric and
(1.14) holds. The DFP update is the solution to


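$$\text{minimize } \|J^{-1} (H_{k+1} - H_k) J^{-T}\|_F \text{ subject to } H_{k+1} \text{ symmetric}, \ H_{k+1} s_k = y_k, \qquad (1.15)$$

and the BFGS update is the solution to

$$\text{minimize } \|J^{T} (H_{k+1}^{-1} - H_k^{-1}) J\|_F \text{ subject to } H_{k+1} \text{ symmetric}, \ H_{k+1} s_k = y_k, \qquad (1.16)$$

where in both cases $J$ is any nonsingular matrix for which $J J^T s_k = y_k$.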

Broyden, Dennis and Moré [1973] showed that the iterates generated by (1.10)
using either the BFGS or DFP update to generate $\{H_k\}$ converge locally and
q-superlinearly to a minimizer $x_*$ of $f(x)$ under reasonable assumptions.
Algorithms using the BFGS update have proven to be the most robust and efficient
secant algorithms for unconstrained minimization in practice. For more
information on secant methods for unconstrained minimization see Dennis and Moré
[1977], Fletcher [1980], Gill, Murray, and Wright [1981], or Dennis and Schnabel
[1983].
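
As a small illustration of the BFGS update (1.12) and of the role of the curvature condition (1.14), the following sketch applies one update in exact rational arithmetic. The function names and the 2 x 2 data are hypothetical, and the code assumes that `H` is symmetric, so that $H s s^T H = (Hs)(Hs)^T$.

```python
from fractions import Fraction


def matvec(H, v):
    """Return H v for a list-of-rows matrix H."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]


def dot(u, v):
    """Return the inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def bfgs_update(H, s, y):
    """Return H + y y^T / (y^T s) - (H s)(H s)^T / (s^T H s) for symmetric H."""
    ys = dot(y, s)
    if ys <= 0:
        # Condition (1.14) fails, so positive definiteness is not guaranteed.
        raise ValueError("curvature condition y^T s > 0 violated")
    Hs = matvec(H, s)
    sHs = dot(s, Hs)
    n = len(s)
    return [[H[i][j] + y[i] * y[j] / ys - Hs[i] * Hs[j] / sHs
             for j in range(n)] for i in range(n)]


# Illustrative data (hypothetical): H symmetric positive definite, y^T s = 8 > 0.
H = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
s = [Fraction(1), Fraction(2)]
y = [Fraction(2), Fraction(3)]
H_plus = bfgs_update(H, s, y)
```

For these data `H_plus` satisfies the secant equation (1.7) and remains symmetric and positive definite, as the discussion above predicts.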

Section 3 of this paper considers methods for unconstrained minimization
where the Hessian approximation is asked to satisfy $p \le n$ secant equations

$$H_{k+1} S_k = Y_k \qquad (1.17)$$

for $S_k, Y_k \in R^{n \times p}$. If $S_k$ and $Y_k$ are chosen in the obvious way,

$$S_k e_j = x_{k+1} - x_{k+1-j}, \quad Y_k e_j = \nabla f(x_{k+1}) - \nabla f(x_{k+1-j}), \quad j = 1, \dots, p, \qquad (1.18)$$

then the new quadratic model would interpolate the $p$ most recent previous
gradients,

$$\nabla m_{k+1}(x_{k+1-j}) = \nabla f(x_{k+1-j}), \quad j = 0, \dots, p. \qquad (1.19)$$

However (1.19) may be inconsistent with the requirement that $H_{k+1}$ be
symmetric. Section 3.1 gives very simple necessary and sufficient conditions for
there to be a symmetric, or symmetric and positive definite, $H_{k+1}$ satisfying
(1.17). If these conditions are satisfied, then all the results about the PSB,
BFGS, and DFP updates mentioned in the two previous paragraphs can be
generalized to symmetric updates that satisfy (1.17), and to minimization
algorithms that use these updates. Section 3.2 gives the generalizations of the PSB,
DFP, and BFGS updates that satisfy multiple secant equations, and shows that
they are least change symmetric updates in the same norms used in (1.9),
(1.15) and (1.16) respectively. Section 3.3 considers a special case of
symmetric updates satisfying multiple secant equations that has received considerable
attention, the "projected" updates introduced by Davidon [1975] and
subsequently considered by Dennis and Schnabel [1981], Nazareth [1976], Schnabel
[1977, 1978], and others. Here one assumes that $H_k$ already satisfies $p-1$ of the
$p$ secant equations imposed upon $H_{k+1}$. We show that several of the projected
updates derived by these authors are special cases of the generalized PSB, DFP
or BFGS updates given in Section 3.2. Section 3.4 gives conditions on $\{S_k\}$ and
$\{Y_k\}$ under which the iterates generated by (1.10), using the generalized PSB,
DFP or BFGS updates, converge locally and q-superlinearly to a minimizer $x_*$ of
$f(x)$. The proofs require only minor modification of the techniques of Broyden,
Dennis and Moré [1973] and Dennis and Moré [1974]. Finally in Section 3.5 we
propose several ways for the preceding material on unconstrained minimization
to have practical application, by suggesting several reasonable modifications of
$Y_k$ given by (1.18) that would allow symmetric (and positive definite) updates
satisfying $H_{k+1} S_k = Y_k$ to exist. These modifications to $Y_k$ do not alter the
current secant equation $H_{k+1} s_k = y_k$, and alter the other secant equations in a
reasonable way. The resultant algorithms obey the conditions of Section 3.4 for
q-superlinear convergence.


2. Multiple secant equations for nonlinear equations


The most basic use of secant approximations is in quasi-Newton methods for
solving systems of nonlinear equations. The approximation problem underlying
the standard methods is to find an $A_+ \in R^{m \times n}$ for which $A_+ s = y$, where $s \in R^n$ and
$y \in R^m$. As we mentioned in Section 1, the most successful practical method is
based on choosing the $A_+$ that solves

$$\text{minimize } \|A_+ - A\|_F \text{ subject to } A_+ s = y,$$

where $A \in R^{m \times n}$. The generalization of this approximation problem to multiple
secant equations is

$$\min_{A_+ \in R^{m \times n}} \|A_+ - A\|_F \text{ subject to } A_+ S = Y, \qquad (2.1)$$

where $S \in R^{n \times p}$, $Y \in R^{m \times p}$. The solution to (2.1) is given in Theorem 2.1.

The remainder of this section discusses methods for solving square systems
of nonlinear equations where at each iteration, the update given in Theorem 2.1
is used to calculate a Jacobian approximation $A_{k+1} \in R^{n \times n}$ that satisfies
$A_{k+1} S_k = Y_k$ for some $S_k, Y_k \in R^{n \times p_k}$. A special case, considered by Barnes [1965]
and Gay and Schnabel [1978], is when each update enforces the new secant
equation and preserves some old secant equations satisfied by $A_k$. Updates with
this property are sometimes called "projected secant updates". The least
change projected secant update, a simple corollary of Theorem 2.1, is given in
Corollary 2.2. Theorem 2.5 then gives general conditions on $\{S_k\}$ and $\{Y_k\}$ for a
quasi-Newton method based on least change multiple secant updates to be
q-linearly, or q-superlinearly, convergent. It uses a generalization of the Broyden,
Dennis, and Moré [1973] bounded deterioration theorem that we state in
Theorem 2.3, and the Dennis-Moré [1974] characterization of q-superlinear
convergence that we state in Theorem 2.4. Corollary 2.6 shows that the
q-superlinear result of Theorem 2.5 applies to a class of methods that enforce the
current and some past secant equations, including the method of Gay and
Schnabel. This class of methods also includes some algorithms not considered
by Gay and Schnabel that may be of practical interest.

Theorem 2.1. Let $p \le n$, $A \in R^{m \times n}$, $S \in R^{n \times p}$, $Y \in R^{m \times p}$, rank$(S) = p$. Then the
unique solution to (2.1) is

$$A_+ = A + (Y - A S) (S^T S)^{-1} S^T. \qquad (2.2)$$

Proof: It is straightforward to derive (2.2) by regarding (2.1) as $m$ linear least
squares problems $S^T b_i = z_i$, $i = 1, \dots, m$, in the variables $b_i$, where $b_i^T$ = row $i$ of
$A_+ - A$ and $z_i^T$ = row $i$ of $(Y - AS)$. A different proof, given below, uses techniques
of Dennis and Moré [1977] that are more closely related to the techniques we will
use in Section 3.

Clearly $A_+$ given by (2.2) is well-defined and satisfies $A_+ S = Y$. Now let
$B \in R^{m \times n}$ be any matrix satisfying $B S = Y$. Substituting $BS$ for $Y$ in (2.2) gives

$$A_+ - A = (B - A) S (S^T S)^{-1} S^T = (B - A) P,$$

where $P = S (S^T S)^{-1} S^T$ is a Euclidean projection matrix and thus $\|P\|_2 \le 1$.
Therefore

$$\|A_+ - A\|_F \le \|B - A\|_F \, \|P\|_2 \le \|B - A\|_F.$$

The solution is unique because (2.1) is a minimization problem in a strictly
convex norm over a convex set. ∎
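
The update (2.2) can be exercised on a small example. The sketch below is only illustrative: the helper names (`matmul`, `transpose`, `solve`, `multiple_secant_update`) and the 3 x 2 test data are assumptions, and a simple exact Gauss-Jordan solver stands in for whatever factorization a practical code would use to apply $(S^T S)^{-1}$.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*A)]


def solve(M, B):
    """Solve M X = B exactly by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in list(M[i]) + list(B[i])] for i in range(n)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def multiple_secant_update(A, S, Y):
    """Return A + (Y - A S)(S^T S)^{-1} S^T, assuming S has full column rank."""
    St = transpose(S)
    X = solve(matmul(St, S), St)                                     # (S^T S)^{-1} S^T
    AS = matmul(A, S)
    R = [[y - a for y, a in zip(yr, ar)] for yr, ar in zip(Y, AS)]   # Y - A S
    RX = matmul(R, X)
    return [[a + c for a, c in zip(ar, cr)] for ar, cr in zip(A, RX)]


# Illustrative data (hypothetical): n = m = 3, p = 2.
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
S = [[1, 0], [0, 1], [1, 1]]
Y = [[1, 2], [3, 4], [5, 6]]
A_plus = multiple_secant_update(A, S, Y)
```

Consistent with the projection argument in the proof, `A_plus` satisfies both secant equations and agrees with `A` on every vector orthogonal to the columns of `S`.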

The use of secant updates in solving systems of nonlinear equations was
reviewed in Section 1. The standard secant update for nonlinear equations,
Broyden's update, causes the affine model

$$M_{k+1}(x) = F(x_{k+1}) + A_{k+1} (x - x_{k+1}) \qquad (2.3)$$

of $F(x)$ around $x_{k+1}$ to interpolate $F(x)$ at $x_k$ and $x_{k+1}$. An obvious use for
multiple secant equations in solving systems of nonlinear equations is to cause (2.3)
to interpolate $F(x)$ at additional past iterates. For example, if $\{x_{l_{jk}}\}$, $j = 1, \dots, p_k$, is a
sequence of past iterates satisfying

$$k = l_{1k} > l_{2k} > \cdots > l_{p_k k} \ge 0, \qquad (2.4)$$

and $A_{k+1} S_k = Y_k$, where

$$S_k e_j = x_{k+1} - x_{l_{jk}}, \quad Y_k e_j = F(x_{k+1}) - F(x_{l_{jk}}), \quad j = 1, \dots, p_k, \qquad (2.5)$$

then $M_{k+1}(x)$ interpolates $F(x)$ at $x_{l_{jk}}$, $j = 2, \dots, p_k$, as well as at $x_k$ and $x_{k+1}$.
Conditions for a method based on the above secant equations to be
q-superlinearly convergent are given in Corollary 2.6. (Clearly, $S_k$ must have full
column rank to guarantee the existence of $A_{k+1}$.)

A special case of the above is when all but one of the function values that we
ask $M_{k+1}(x)$ to interpolate already are interpolated by $M_k(x)$. Barnes [1965] and
Gay and Schnabel [1978] consider a strategy that has this property. They ask
the model $M_{k+1}(x)$ to interpolate $F(x)$ at $p_k$ consecutive past iterates, as well as
at $x_{k+1}$. In the notation of the previous paragraph, this means that $l_{jk} = k+1-j$,
$j = 1, \dots, p_k$. Thus

$$S_k e_j = x_{k+1} - x_{k+1-j}, \quad Y_k e_j = F(x_{k+1}) - F(x_{k+1-j}). \qquad (2.6)$$

Due to the linearity of the model (2.3), it is equivalent to define

$$S_k e_j = x_{k+2-j} - x_{k+1-j}, \quad Y_k e_j = F(x_{k+2-j}) - F(x_{k+1-j}). \qquad (2.7)$$

Barnes and Gay and Schnabel also assume that $p_k \le p_{k-1} + 1$, meaning that any
previous function values that $M_{k+1}(x)$ should interpolate already are
interpolated by $M_k(x)$. If the secant conditions are defined by (2.7), this implies that

$$(Y_k - A_k S_k) e_j = 0, \quad j = 2, \dots, p_k,$$

so that

$$Y_k - A_k S_k = (Y_k - A_k S_k) e_1 e_1^T = (y_k - A_k s_k) e_1^T,$$

where $A_{k+1} s_k = y_k$ is the current secant equation, i.e.

$$s_k = x_{k+1} - x_k, \quad y_k = F(x_{k+1}) - F(x_k).$$

If the secant equations are defined by (2.6), then it is easy to show that

$$(Y_k - A_k S_k) e_j = y_k - A_k s_k, \quad j = 1, \dots, p_k,$$

so that

$$Y_k - A_k S_k = (y_k - A_k s_k) (1, \dots, 1).$$

In either case $(Y - AS)$ is a rank one matrix. Corollary 2.2 shows that the least
change multiple secant update is a rank one update in this case.


Corollary 2.2. Let the conditions of Theorem 2.1 be satisfied, and let $Y - AS =
(y - As) v^T$ where $v \in R^p$ is nonzero. Then the unique solution to (2.1) is

$$A_+ = A + (y - As) w^T,$$

where

$$w = S (S^T S)^{-1} v.$$

Proof: Immediate from Theorem 2.1.


If $v = e_1$, as in the methods of Barnes and Gay and Schnabel, then it is
straightforward to show that $w$ is a multiple of the Euclidean projection of the
first column of $S$ onto the linear subspace orthogonal to the remaining columns
of $S$. The term "projected secant update" comes from this relationship.

A local method based on the multiple secant updates discussed above is to
select each $x_{k+1}$ to be the root of $M_k(x)$,

$$x_{k+1} = x_k - A_k^{-1} F(x_k), \qquad (2.8)$$

then choose $S_k$ and $Y_k$ and update

$$A_{k+1} = A_k + (Y_k - A_k S_k) (S_k^T S_k)^{-1} S_k^T. \qquad (2.9)$$

Theorem 2.5 gives necessary conditions on $\{S_k\}$ and $\{Y_k\}$ for the sequence of
iterates generated by (2.8-9) to be locally q-linearly, or q-superlinearly,
convergent to a root $x_*$ of $F(x)$ where $F'(x_*)$ is nonsingular. The linear result is based
on Theorem 2.3, a slight generalization of the bounded deterioration Theorem
3.2 in Broyden, Dennis, and Moré [1973], which differs only in that $q = 0$. The
proof of Theorem 2.3 is omitted; see Theorem 9.2.2 of Schnabel [1977] for a
proof of a slightly more general theorem. The superlinear result is based on the
well known theorem of Dennis and Moré [1974] which we restate in Theorem 2.4.

In the remainder of this paper, $\|\cdot\|$ denotes the $l_2$ vector or matrix norm.
For any $S \in R^{n \times p}$ with full column rank, $K(S)$ denotes the $l_2$ condition number of
$S$, $K(S) = \|S\| \, \|(S^T S)^{-1} S^T\|$. For any $x \in R^n$, we define $N(x, \eta)$ to be the set
$\{z \in R^n : \|z - x\| < \eta\}$.

Theorem 2.3. (Broyden, Dennis, and Moré [1973], Schnabel [1977])
Let $F : R^n \to R^n$ be continuously differentiable in an open convex set $D$, and
assume there exist $x_* \in D$, $\eta > 0$, and $\gamma$ satisfying $F(x_*) = 0$, $F'(x_*)$
is nonsingular, and $\|F'(z) - F'(x)\| \le \gamma \|z - x\|$ for all $z, x \in N(x_*, \eta)$. Consider
the sequence $\{x_0, x_1, \dots\}$ of points in $R^n$ generated by (2.8), where the
sequence $\{A_0, A_1, \dots\}$ of matrices in $R^{n \times n}$ satisfies

$$\|A_{k+1} - F'(x_*)\|_F \le \|A_k - F'(x_*)\|_F (1 + \alpha_1 \mu_k) + \alpha_2 \mu_k, \qquad (2.10)$$

$$\mu_k = \max \{ \|x_{k+1} - x_*\|, \|x_k - x_*\|, \dots, \|x_{k-q_k} - x_*\| \}, \qquad (2.11)$$

$k = 0, 1, \dots$, for some fixed $\alpha_1 \ge 0$, $\alpha_2 \ge 0$, with $q_k = \min\{k, q\}$ for some fixed $q \ge 0$.
Then for each $r \in (0, 1)$, there exist positive constants $\epsilon(r)$, $\delta(r)$ such that if
$\|x_0 - x_*\| \le \epsilon(r)$ and $\|A_0 - F'(x_*)\|_F \le \delta(r)$, the sequence $\{x_0, x_1, \dots\}$ is
well-defined and converges to $x_*$ with

$$\|x_{k+1} - x_*\| \le r \|x_k - x_*\|$$

for all $k$. Furthermore, $\{\|A_k\|\}$ and $\{\|A_k^{-1}\|\}$ are uniformly bounded.

Theorem 2.4. (Dennis and Moré [1974])

Let the assumptions of Theorem 2.3 hold. Let $\{A_k\}$ be a sequence of nonsingular
matrices in $R^{n \times n}$, and suppose for some $x_0 \in R^n$ the sequence of points
generated by (2.8) remains in $D$ and converges to $x_*$, with $x_k \ne x_*$ for any $k$. Then
$\{x_k\}$ converges q-superlinearly to $x_*$ if and only if

$$\lim_{k \to \infty} \frac{\|(A_k - F'(x_*)) (x_{k+1} - x_k)\|}{\|x_{k+1} - x_k\|} = 0. \qquad (2.12)$$

Theorem 2.5. Let the assumptions of Theorem 2.3 hold. Consider the sequences
$\{x_k\}$ and $\{A_k\}$ generated from $x_0 \in R^n$ and $A_0 \in R^{n \times n}$ by (2.8-9) where $S_k, Y_k \in R^{n \times p_k}$
with each $p_k \in [1, n]$. Suppose there exist $c_1 \ge 0$, $c_2 \ge 1$, $q \ge 0$, such that for $k =
0, 1, \dots$,

$$\|Y_k - F'(x_*) S_k\|_F \le c_1 \|S_k\| \max \{ \|x_{k-i} - x_*\| : i = -1, 0, \dots, q_k \} \qquad (2.13)$$

and

$$K(S_k) \le c_2, \qquad (2.14)$$

where each $q_k = \min\{k, q\}$. Then there exist $\epsilon > 0$, $\delta > 0$ such that if $\|x_0 - x_*\| \le \epsilon$
and $\|A_0 - F'(x_*)\|_F \le \delta$, the sequence $\{x_k\}$ is well-defined and converges q-linearly
to $x_*$, and $\{\|A_k\|\}$ and $\{\|A_k^{-1}\|\}$ are uniformly bounded. If in addition, for each $k$ there
exists $v_k \in R^{p_k}$ for which

$$S_k v_k = x_{k+1} - x_k, \qquad (2.15)$$

then the rate of convergence is q-superlinear.

Proof: Let $J_* = F'(x_*)$. From (2.9),

$$(A_{k+1} - J_*) = (A_k - J_*) (I - S_k (S_k^T S_k)^{-1} S_k^T) + (Y_k - J_* S_k) (S_k^T S_k)^{-1} S_k^T. \qquad (2.16)$$

Define $P_k = S_k (S_k^T S_k)^{-1} S_k^T$, and recall that $\|I - P_k\|_2 \le 1$. Then using also
(2.13), with $\mu_k$ defined by (2.11), in (2.16) gives

$$\|A_{k+1} - J_*\|_F \le \|A_k - J_*\|_F \, \|I - P_k\|_2 + \|Y_k - J_* S_k\|_F \, \|(S_k^T S_k)^{-1} S_k^T\|_2 \le \|A_k - J_*\|_F + c_1 K(S_k) \mu_k.$$

Therefore from (2.14), $A_{k+1}$ satisfies (2.10) with $\alpha_1 = 0$ and $\alpha_2 = c_1 c_2$, which proves
q-linear convergence.

To prove q-superlinear convergence, define $E_k = (A_k - J_*)$. Since $P_k$ is a
Euclidean projection matrix,

$$\|E_k (I - P_k)\|_F \le \|E_k\|_F - \frac{\|E_k P_k\|_F^2}{2 \|E_k\|_F}. \qquad (2.17)$$

Then from (2.16), (2.17), (2.13), and (2.14),

$$\|E_{k+1}\|_F \le \|E_k\|_F - \frac{\|E_k P_k\|_F^2}{2 \|E_k\|_F} + c_1 c_2 \mu_k,$$

which implies

$$\|E_k P_k\|_F^2 \le 2 \|E_k\|_F \left( \|E_k\|_F - \|E_{k+1}\|_F + c_1 c_2 \mu_k \right). \qquad (2.18)$$

From the proof of linear convergence, there exist $\rho, \beta \in (0, \infty)$ such that $\sum_{k=0}^{\infty} \mu_k \le \rho$
and $\|E_k\|_F \le \beta$ for all $k$. Using these bounds and summing (2.18) from $k = 0$ to $j$
gives

$$\sum_{k=0}^{j} \|E_k P_k\|_F^2 \le 2 \beta \left( \|E_0\|_F - \|E_{j+1}\|_F + c_1 c_2 \sum_{k=0}^{j} \mu_k \right) \le 2 \beta (\beta + c_1 c_2 \rho),$$

which proves that

$$\lim_{k \to \infty} \|E_k P_k\|_F = 0. \qquad (2.19)$$

Finally we show that, if (2.15) is true, then (2.19) implies the Dennis-Moré
condition (2.12) for superlinear convergence. Define $s_k = (x_{k+1} - x_k)$. Then from
(2.15),

$$E_k P_k s_k = E_k S_k (S_k^T S_k)^{-1} S_k^T S_k v_k = E_k S_k v_k = E_k s_k,$$

so that by the definition of an induced matrix norm,

$$\|E_k P_k\|_F \ge \frac{\|E_k P_k s_k\|}{\|s_k\|} = \frac{\|E_k s_k\|}{\|s_k\|},$$

and from (2.19)

$$\lim_{k \to \infty} \frac{\|E_k s_k\|}{\|s_k\|} \le \lim_{k \to \infty} \|E_k P_k\|_F = 0.$$

Thus the method (2.8-9) satisfies condition (2.12) of Theorem 2.4 and is
q-superlinearly convergent. ∎


Theorem 2.5 says, roughly, that if $A_{k+1} S_k = Y_k$ are reasonable secant
equations in that $F'(x_*) S_k$ is close enough to $Y_k$, and if the columns of $S_k$ are
sufficiently linearly independent, then the method (2.8-9) will be locally
q-linearly convergent. If in addition the most recent secant equation $A_{k+1} s_k = y_k$
always is included, the method will be q-superlinearly convergent. Corollary 2.6
shows that the choices of $S_k$ and $Y_k$ given by (2.4-5), which cause $M_{k+1}(x)$ to
interpolate $F(x)$ at $p_k$ not necessarily consecutive past iterates including the
most recent, satisfy these criteria as long as the past iterates are chosen so that
each $S_k$ is sufficiently linearly independent, and there is some upper bound on
how many iterations back the secant equations can go.

Corollary 2.6. Let the assumptions of Theorem 2.3 hold, and let $q \ge 1$ be fixed.
Consider the sequences $\{x_k\}$ and $\{A_k\}$ generated from $x_0 \in R^n$ and $A_0 \in R^{n \times n}$ by
(2.8-9), where for each $k$, $1 \le p_k \le \min\{k+1, n, q\}$, $S_k, Y_k \in R^{n \times p_k}$ with

$$K(S_k) \le c$$

for some fixed $c \ge 1$, and

$$S_k e_j = x_{k+1} - x_{l_{jk}}, \quad Y_k e_j = F(x_{k+1}) - F(x_{l_{jk}}), \quad j = 1, \dots, p_k,$$

where

$$k \ge l_{1k} > l_{2k} > \cdots > l_{p_k k} \ge \max\{0, k+1-q\}.$$

Then there exist $\epsilon, \delta > 0$ such that if $\|x_0 - x_*\| < \epsilon$ and $\|A_0 - F'(x_*)\|_F < \delta$, the
sequence $\{x_k\}$ is well-defined and converges q-linearly to $x_*$. Furthermore, $\{\|A_k\|\}$
and $\{\|A_k^{-1}\|\}$ are uniformly bounded. If $l_{1k} = k$ for all $k$, the rate of convergence is
q-superlinear.

Proof: By a well known lemma (see for example Section 3.2.5 of Ortega and
Rheinboldt [1970]),

$$\|(Y_k - F'(x_*) S_k) e_j\| \le \gamma \|x_{k+1} - x_{l_{jk}}\| \max \{ \|x_{k+1} - x_*\|, \|x_{l_{jk}} - x_*\| \} \le \gamma \|S_k e_j\| \mu_k,$$

where the last inequality uses only the definitions of $S_k$, and of $\mu_k$ from (2.11).
Thus

$$\|Y_k - F'(x_*) S_k\|_F \le \gamma \|S_k\|_F \mu_k \le \gamma \sqrt{n} \, \|S_k\| \mu_k,$$

so (2.13) is satisfied and q-linear convergence is established by Theorem 2.5. If
$l_{1k} = k$ for all $k$, then q-superlinear convergence follows trivially since (2.15) is
true with $v_k = e_1$ for all $k$. ∎

The strategies covered by Corollary 2.6 for choosing the past iterates whose
function values the model will interpolate include the strategy implemented by
Gay and Schnabel [1978], as well as the strategy used by Schnabel and Frank
[1983] in their "tensor method" for nonlinear equations. Schnabel and Frank
always select $S_k e_1 = (x_{k+1} - x_k)$. Then they consider, in order, the steps from
$x_{k+1}$ to $x_{k-1}, \dots, x_{k+1-q}$; they include $x_{k+1} - x_i$ as a column of $S_k$ if and only
if it makes an angle of more than 45° with the linear subspace spanned by the
already selected columns of $S_k$. Their experience is that the best results are
obtained using only information from fairly recent past iterates; they restrict
$p_k$ and $q$ to be at most $\sqrt{n}$. This strategy allows considerably more flexibility
in choosing past iterates than the strategy tested by Gay and Schnabel; it would
be interesting to test a secant algorithm for nonlinear equations that uses it.
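
The angle test just described can be sketched as follows. This is not the implementation of Schnabel and Frank; it is a minimal illustration of the stated rule, in which the function name `select_past_steps`, its parameters, and the test data are all hypothetical. The angle between a candidate step and the span of the steps already chosen is measured through the residual left after orthogonal projection onto that span.

```python
import math


def select_past_steps(x_new, past_iterates, max_cols, min_angle_deg=45.0):
    """Pick steps x_new - x_i whose angle to the span of earlier picks exceeds
    min_angle_deg. past_iterates is ordered most recent first; the step to the
    most recent iterate is always kept. Returns indices into past_iterates."""
    sin2_min = math.sin(math.radians(min_angle_deg)) ** 2
    basis = []    # orthonormal basis of the span of the selected steps
    chosen = []
    for idx, x_old in enumerate(past_iterates):
        if len(chosen) >= max_cols:
            break
        d = [a - b for a, b in zip(x_new, x_old)]
        norm2 = sum(v * v for v in d)
        if norm2 == 0.0:
            continue  # a zero step carries no information
        resid = d[:]
        for q in basis:
            coef = sum(a * b for a, b in zip(q, d))
            resid = [r - coef * qi for r, qi in zip(resid, q)]
        resid2 = sum(v * v for v in resid)
        # sin^2 of the angle between d and the current span is resid2 / norm2
        if idx == 0 or resid2 / norm2 > sin2_min:
            rn = math.sqrt(resid2)
            basis.append([v / rn for v in resid])
            chosen.append(idx)
    return chosen
```

For example, with `x_new` at the origin and past iterates `(-1, 0, 0)`, `(-2, -0.1, 0)`, `(0, -1, 0)` listed most recent first, the second step is nearly parallel to the first and is skipped, while the third is orthogonal to it and is kept. Bounding `max_cols` plays the role of the bound on $p_k$ mentioned above.
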
3. Multiple secant equations for unconstrained optimization

Now we turn to the unconstrained minimization problem (1.6), which we
reviewed briefly in Section 1. The standard quadratic model of the objective
function,

$$m_{k+1}(x) = f(x_{k+1}) + \nabla f(x_{k+1})^T (x - x_{k+1}) + \tfrac{1}{2} (x - x_{k+1})^T H_{k+1} (x - x_{k+1}), \qquad (3.1)$$

could interpolate several past gradient values if the symmetric approximation to
the Hessian $H_{k+1}$ obeyed several secant equations

$$H_{k+1} S_k = Y_k, \qquad (3.2)$$

where $S_k$, $Y_k$ are given by (1.18). Several authors, starting with Schnabel
[1977], have noted that (3.2) may be inconsistent with the symmetry of $H_{k+1}$. In
Section 3.1 we show that there exists a symmetric, or symmetric and positive
definite, $H_{k+1}$ satisfying (3.2) if and only if $Y_k^T S_k$ is symmetric, or symmetric and
positive definite, respectively. While the natural choices (1.18) of $S_k$ and $Y_k$
satisfy these conditions if $f(x)$ is a positive definite quadratic, for general $f(x)$
$Y_k^T S_k$ usually is not even symmetric. In Section 3.5 we attempt to remedy this
difficulty by proposing several reasonable ways to perturb $Y_k$ to a $\hat{Y}_k$ for which
$\hat{Y}_k^T S_k$ is symmetric and positive definite. The preceding sections, 3.2-3.4, discuss
the updates and methods that may be used if the conditions for symmetric (and
positive definite) multiple secant updates to exist are satisfied. Section 3.2
introduces generalizations of the PSB, DFP, and BFGS updates that satisfy (3.2)
and shows that they are the least change updates in the appropriate norms. In
Section 3.3 we show that several "projected secant updates" that have been
proposed for unconstrained minimization are special cases of the updates discussed
in Section 3.2. Section 3.4 shows that quasi-Newton methods based on our
generalizations of the PSB, DFP, or BFGS updates are locally q-superlinearly
convergent under standard assumptions. The methods proposed in Section 3.5 satisfy
the conditions for q-superlinear convergence.


3.1. Necessary and Sufficient Conditions for Symmetric Multiple Secant
Updates


Theorem 3.1. Let $S, Y \in R^{n \times p}$, rank$(S) = p$. Then there exist symmetric
$H_+ \in R^{n \times n}$ such that $H_+ S = Y$ if and only if $Y^T S$ is symmetric. There exist
symmetric and positive definite $H_+ \in R^{n \times n}$ such that $H_+ S = Y$ if and only if $Y^T S$ is
symmetric and positive definite.


Proof: Only if: Suppose there exists a symmetric $H_+$ for which $H_+ S = Y$. Then
$S^T H_+ S = Y^T S$ is symmetric. Similarly, if $H_+$ is symmetric and positive definite,
then $S^T H_+ S = Y^T S$ is symmetric and positive definite.

If: Suppose $Y^T S$ is symmetric. Then

$$H_1 = Y (S^T S)^{-1} S^T + S (S^T S)^{-1} Y^T - S (S^T S)^{-1} (Y^T S) (S^T S)^{-1} S^T \qquad (3.3)$$

is well-defined, symmetric, and obeys $H_1 S = Y$. Now suppose $Y^T S$ is symmetric
and positive definite. Then

$$H_2 = Y (Y^T S)^{-1} Y^T$$

is well-defined, symmetric, obeys $H_2 S = Y$, and is at least positive semi-definite.
Also rank$(Y) = p$ from $Y^T S$ nonsingular. Thus if $p = n$, $H_2$ is positive definite. If
$p < n$, let $Z \in R^{n \times m}$ be any matrix whose columns all are in, and together span,
the null space of $S^T$, that is, $Z^T S = 0$, $m \ge n-p$, and rank$(Z) = n-p$. Then

$$H_3 = Y (Y^T S)^{-1} Y^T + Z Z^T \qquad (3.4)$$

is well-defined, symmetric, obeys $H_3 S = Y$, and is at least positive semi-definite.
Now let $U \in R^{n \times (n-p)}$ be an orthonormal basis for the null space of $S^T$. Then $Z =
U N$ where $N^T \in R^{m \times (n-p)}$ has full column rank, i.e. $N N^T$ is nonsingular. Then
from (3.4),

$$H_3 = M_1 M_2 M_1^T,$$

where

$$M_1 = [\, Y \;\; U \,], \quad M_2 = \begin{bmatrix} (Y^T S)^{-1} & 0 \\ 0 & N N^T \end{bmatrix}.$$

Clearly $M_2$ is nonsingular, and since

$$M_1^T [\, S \;\; U \,] = \begin{bmatrix} Y^T S & Y^T U \\ 0 & I \end{bmatrix}$$

is nonsingular, $M_1$ is nonsingular. Therefore $H_3$ is nonsingular and hence, positive definite. ∎


Note that the above proof could be simplified slightly by defining $Z = U$;
however the more general definition of $Z$ will be useful to us in Section 3.2.

Now let us consider whether the conditions of Theorem 3.1 are likely to be
satisfied in the context of an unconstrained minimization algorithm. Suppose,
as in Section 2, that $\{x_{l_{jk}}\}$ is a sequence of past iterates satisfying (2.4) and $S_k$,
$Y_k$ are defined by

$$S_k e_j = x_{k+1} - x_{l_{jk}}, \quad Y_k e_j = \nabla f(x_{k+1}) - \nabla f(x_{l_{jk}}). \qquad (3.5)$$

If $f(x)$ is quadratic, then $Y_k = \nabla^2 f(x) S_k$, so $Y_k^T S_k$ is symmetric for any $S_k$,
and $Y_k^T S_k$ is positive definite if $\nabla^2 f(x)$ is positive definite and $S_k$ has full column
rank. When $f(x)$ is not quadratic, however, it is unlikely that $Y_k^T S_k$ is
symmetric, as illustrated by the following example.


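Example 3.1. Let $x \in R^2$, $f(x) = \tfrac{1}{2} (x[1])^2 + \tfrac{1}{2} (x[2])^2 + \tfrac{1}{4} (x[2])^4$, and suppose
some algorithm generates $x_0 = (-2, -2)$, $x_1 = (-1, -1)$, $x_2 = (-1, 0)$. If, in the
notation of the preceding paragraph, $x_{l_{jk}} = x_{2-j}$, $j = 1, 2$, then

$$S_k = \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix}, \quad Y_k = \begin{bmatrix} 0 & 1 \\ 2 & 10 \end{bmatrix},$$

so that $Y_k^T S_k = \begin{bmatrix} 2 & 4 \\ 10 & 21 \end{bmatrix}$ is not symmetric.
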
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Since  the  natural  secant  equations  for  unconstraineo  minimization,  + 

-  >*  with  5*  and  1*  defined  by  (3  h).  rarely  will  satisfy  the  conditions  of 
ilieorem  3.1  whenp*>!,  it  might  seem  that  the  topic  of  multiple  secant  equa¬ 
tions  for  unconstrained  minimization  is  fruitless  In  Section  3  d,  however,  we  will 
snow  how  a  practical  algorithm  for  unconstrained  mmimLization  might  generate 
multiple  secant  equations  that  satisfy  the  conditions  of  Theorem  3  1,  without 
changing  the  current  secant  equation.  Sections  3.2-3  4  investigate  updates  cind 
methods  that  are  possible  when  the  conditions  of  Theorem  3. ;  are  satisfied. 
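The arithmetic behind Example 3.1 can be reproduced with a few lines of code. This is a small sketch that assumes the function of the example is $f(x) = \tfrac12 x[1]^2 + \tfrac12 x[2]^2 + \tfrac14 x[2]^4$ and that the iterates are $x_0 = (-2,-2)$, $x_1 = (-1,-1)$, $x_2 = (-1,0)$; the variable names are illustrative only.

```python
def grad_f(x):
    """Gradient of the assumed f(x) = x1^2/2 + x2^2/2 + x2^4/4."""
    x1, x2 = x
    return (x1, x2 + x2 ** 3)

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

iterates = [(-2, -2), (-1, -1), (-1, 0)]   # x_0, x_1, x_2
x_new = iterates[2]                        # x_{k+1} with k = 1
past = [iterates[1], iterates[0]]          # x_{l_1} = x_1, x_{l_2} = x_0

# Columns of S and Y: steps and gradient differences measured from x_{k+1}, as in (3.5).
s_cols = [[a - b for a, b in zip(x_new, x)] for x in past]
y_cols = [[a - b for a, b in zip(grad_f(x_new), grad_f(x))] for x in past]

S = transpose(s_cols)
Y = transpose(y_cols)
YtS = matmul(transpose(Y), S)
is_symmetric = YtS == transpose(YtS)
```

With these inputs `YtS` has off-diagonal entries 4 and 10, so `is_symmetric` is `False`: no symmetric matrix can satisfy both secant equations for this data.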


3.2.  Least  change  symmetric  multiple  secant  updates 

The reader may have noticed that the equation (3.3) used in the proof of Theorem 3.1 reduces, in the case when $p = 1$, to the PSB update of $H = 0$. The corresponding update to a nonzero $H$ would be

$$H_{PSBg} = H + (Y - HS)(S^T S)^{-1} S^T + S (S^T S)^{-1} (Y - HS)^T - S (S^T S)^{-1} (Y - HS)^T S (S^T S)^{-1} S^T. \qquad (3.6)$$

Equation (3.6) is a generalization of the PSB update (1.8) to multiple secant equations, hence the name "PSBg". $H_{PSBg}$ is well-defined and $H_{PSBg} S = Y$ as long as $S$ has full column rank. If $H$ is symmetric, then it is easy to see that $H_{PSBg}$ is symmetric if and only if $Y^T S$ is symmetric. The rank of $H_{PSBg} - H$ is at most $2p$. We show in Theorem 3.2 that if $Y^T S$ and $H$ are symmetric, then $H_{PSBg}$ is the least change symmetric update to $H$, in the Frobenius norm, that satisfies $H_+ S = Y$.
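As an illustration of the generalized PSB formula (3.6), the sketch below evaluates it in exact arithmetic for hypothetical data ($H = I$, $Y = AS$ with a symmetric matrix $A$, so that $Y^T S$ is symmetric). It also compares the Frobenius change with that of $A$ itself, which is another symmetric matrix satisfying the secant equations. The function names are assumptions made for this example.

```python
from fractions import Fraction

def mat(rows):
    return [[Fraction(x) for x in row] for row in rows]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Gauss-Jordan inverse of a small nonsingular square matrix."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def frob2(A):
    """Squared Frobenius norm."""
    return sum(x * x for row in A for x in row)

def psb_g(H, S, Y):
    """Generalized PSB update (3.6); assumes S has full column rank."""
    St = transpose(S)
    A = inverse(matmul(St, S))           # (S^T S)^{-1}, symmetric
    R = sub(Y, matmul(H, S))             # residual Y - H S
    T1 = matmul(matmul(R, A), St)        # (Y - HS)(S^T S)^{-1} S^T
    T2 = transpose(T1)                   # S (S^T S)^{-1} (Y - HS)^T
    T3 = matmul(matmul(matmul(S, A), matmul(transpose(R), S)), matmul(A, St))
    return sub(add(add(H, T1), T2), T3)

# Hypothetical data: Y = A_true S with A_true symmetric, and H the identity.
A_true = mat([[2, 1, 0], [1, 3, 1], [0, 1, 4]])
S = mat([[1, 0], [0, 1], [0, 0]])
Y = matmul(A_true, S)
H = mat([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

H_psb = psb_g(H, S, Y)
change_psb = frob2(sub(H_psb, H))        # squared change made by the PSBg update
change_other = frob2(sub(A_true, H))     # squared change for another feasible matrix
```

Here `change_psb` is 9 while `change_other` is 18, consistent with the least change property stated for (3.6).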

Correspondingly, the DFP update (1.13) may be generalized to

$$H_{DFPg} = H + (Y - HS)(Y^T S)^{-1} Y^T + Y (Y^T S)^{-1} (Y - HS)^T - Y (Y^T S)^{-1} (Y - HS)^T S (Y^T S)^{-1} Y^T. \qquad (3.7)$$

$H_{DFPg}$ is well-defined and $H_{DFPg} S = Y$ whenever $Y^T S$ is nonsingular; it is symmetric if $H$ and $Y^T S$ are symmetric. Again, $H_{DFPg} - H$ has rank at most $2p$. We also show in Theorem 3.2 that if $H$ and $Y^T S$ are symmetric and positive definite, then $H_{DFPg}$ is the solution to

$$\text{minimize } \| W^{-T} (H_+ - H) W^{-1} \|_F \quad \text{subject to } H_+ \text{ symmetric and positive definite},\ H_+ S = Y, \qquad (3.8)$$

where $W \in R^{n \times n}$ is any nonsingular matrix that satisfies $W^T W S = Y$.

The reader also may have noticed that the matrices $H_2$ and $Z$ used in the proof of Theorem 3.1 are related to the BFGS update. In fact, if $H$ is symmetric and positive definite and

$$Z = [\, I - HS (S^T H S)^{-1} S^T \,]\, H^{1/2}, \qquad (3.9)$$

then the matrix $H_2$ given by (3.4) is

$$H_{BFGSg} = H + Y (Y^T S)^{-1} Y^T - HS (S^T H S)^{-1} S^T H, \qquad (3.10)$$

a generalization of the BFGS update (1.12). If $S^T H S$ and $Y^T S$ are nonsingular, $H_{BFGSg}$ is well-defined and $H_{BFGSg} S = Y$; $H_{BFGSg}$ is symmetric if $H$ and $Y^T S$ are symmetric. $H_{BFGSg} - H$ also has rank at most $2p$. Theorem 3.2 also shows that if $H$ and $Y^T S$ are symmetric and positive definite, then $H_{BFGSg}$ is the solution to

$$\text{minimize } \| W (H_+^{-1} - H^{-1}) W^T \|_F \quad \text{subject to } H_+ \text{ symmetric and positive definite},\ H_+ S = Y, \qquad (3.11)$$

for any nonsingular $W \in R^{n \times n}$ that satisfies $W^T W S = Y$.
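The generalized DFP and BFGS formulas can be exercised in the same way. The sketch below is an illustration under assumptions: it reads (3.7) and (3.10) as $H_{DFPg} = H + (Y-HS)(Y^TS)^{-1}Y^T + Y(Y^TS)^{-1}(Y-HS)^T - Y(Y^TS)^{-1}(Y-HS)^T S (Y^TS)^{-1}Y^T$ and $H_{BFGSg} = H + Y(Y^TS)^{-1}Y^T - HS(S^THS)^{-1}S^TH$, and it uses hypothetical data with $H = I$ and $Y^T S$ symmetric positive definite.

```python
from fractions import Fraction

def mat(rows):
    return [[Fraction(x) for x in row] for row in rows]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Gauss-Jordan inverse of a small nonsingular square matrix."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def is_positive_definite(A):
    """A symmetric matrix is positive definite iff all elimination pivots are positive."""
    M = [list(row) for row in A]
    for c in range(len(M)):
        if M[c][c] <= 0:
            return False
        for r in range(c + 1, len(M)):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return True

def dfp_g(H, S, Y):
    """Generalized DFP update; assumes Y^T S is symmetric and nonsingular."""
    Yt = transpose(Y)
    A = inverse(matmul(Yt, S))           # (Y^T S)^{-1}
    R = sub(Y, matmul(H, S))             # residual Y - H S
    T1 = matmul(matmul(R, A), Yt)        # (Y - HS)(Y^T S)^{-1} Y^T
    T2 = transpose(T1)                   # valid because A is symmetric here
    T3 = matmul(matmul(matmul(Y, A), matmul(transpose(R), S)), matmul(A, Yt))
    return sub(add(add(H, T1), T2), T3)

def bfgs_g(H, S, Y):
    """Generalized BFGS update; assumes H symmetric, S^T H S and Y^T S nonsingular."""
    HS = matmul(H, S)
    T1 = matmul(matmul(Y, inverse(matmul(transpose(Y), S))), transpose(Y))
    T2 = matmul(matmul(HS, inverse(matmul(transpose(S), HS))), transpose(HS))
    return sub(add(H, T1), T2)

# Hypothetical data: Y = A S for a symmetric positive definite A, and H the identity.
S = mat([[1, 0], [0, 1], [0, 0]])
Y = mat([[2, 1], [1, 3], [0, 1]])
H = mat([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

H_dfp = dfp_g(H, S, Y)
H_bfgs = bfgs_g(H, S, Y)
```

Both results satisfy the secant equations and remain symmetric positive definite for this data; they differ only in the (3,3) entry (8/5 for DFPg, 7/5 for BFGSg).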


Theorem 3.2. Let $p \le n$, $H \in R^{n \times n}$ symmetric, $S, Y \in R^{n \times p}$, $Y^T S$ symmetric, $\mathrm{rank}(S) = p$. Then the unique solution to

$$\text{minimize } \| H_+ - H \|_F \quad \text{subject to } H_+ \text{ symmetric},\ H_+ S = Y \qquad (3.12)$$

is $H_{PSBg}$ given by (3.6). If in addition $H$ and $Y^T S$ are positive definite, and $W \in R^{n \times n}$ is any nonsingular matrix that satisfies $W^T W S = Y$, then the unique solutions to (3.8) and (3.11) are $H_{DFPg}$ given by (3.7) and $H_{BFGSg}$ given by (3.10), respectively.

Proof: If $S$ has full column rank and $H$, $Y^T S$ are symmetric, then clearly $H_{PSBg}$ given by (3.6) is symmetric and satisfies $H_{PSBg} S = Y$. Now let $H_+$ be any symmetric matrix satisfying $H_+ S = Y$, and define $E_{PSBg} = H_{PSBg} - H$, $E = H_+ - H$. Then substituting $H_+ S$ for each occurrence of $Y$ in (3.6) gives

$$E_{PSBg} = EP + PE - PEP = EP + PE(I - P) \qquad (3.13)$$

where $P = S (S^T S)^{-1} S^T$ is a Euclidean projection matrix. Recall that $\|P\|_2 \le 1$, $\|I - P\|_2 \le 1$, and $P(I - P) = 0$. We also use the fact that for any $M_1, M_2 \in R^{n \times n}$,

$$\| M_1 P + M_2 (I - P) \|_F^2 = \| M_1 P \|_F^2 + \| M_2 (I - P) \|_F^2 + 2\,\mathrm{trace}( M_1 P (I - P) M_2^T ) = \| M_1 P \|_F^2 + \| M_2 (I - P) \|_F^2. \qquad (3.14)$$

Thus from (3.13) and (3.14),

$$\| E_{PSBg} \|_F^2 = \| EP \|_F^2 + \| PE(I - P) \|_F^2 \le \| EP \|_F^2 + \| P \|_2^2\, \| E(I - P) \|_F^2 \le \| EP \|_F^2 + \| E(I - P) \|_F^2 = \| E \|_F^2,$$

with the last equality coming from another application of (3.14). This shows that (3.6) is a solution to (3.12). The solution is unique because (3.12) is a minimization problem in a strictly convex norm over a convex set.

If $H$ and $Y^T S$ are symmetric and positive definite, it is straightforward to verify that the generalized DFP update (3.7) is (3.4) with

$$Z = [\, I - Y (Y^T S)^{-1} S^T \,]\, H^{1/2}.$$

Clearly $Z^T S = 0$, and since $H^{1/2}$ is nonsingular and $Y$, $S$ have full column rank, $Z$ has rank $n - p$. Thus from the proof of Theorem 3.1, $H_{DFPg}$ is symmetric, positive definite, and satisfies $H_{DFPg} S = Y$. The proof that $H_{DFPg}$ is the solution to (3.8) then follows from applying the standard transformation of variables technique to the above proof for the generalized PSB; see for example Dennis and Schnabel [1979]. Finally, since the generalized BFGS update (3.10) is (3.4) with $Z$ given by (3.9), the same argument shows that it is symmetric, positive definite, and satisfies $H_{BFGSg} S = Y$. The proof that $H_{BFGSg}$ is the solution to (3.11) also is obtained in the standard way. First the dual DFP result is obtained; that is, it is shown that the solution to (3.11) is

$$H_+^{-1} = H^{-1} + (S - H^{-1} Y)(Y^T S)^{-1} S^T + S (Y^T S)^{-1} (S - H^{-1} Y)^T - S (Y^T S)^{-1} (S - H^{-1} Y)^T Y (Y^T S)^{-1} S^T. \qquad (3.15)$$

Then it is straightforward to show that for $H_+$ given by (3.15), $H_+ = H_{BFGSg}$. $\blacksquare$


Many algebraic properties of standard symmetric secant updates can be extended to symmetric multiple secant updates. For example, the analog of the Broyden one parameter class (Broyden [1970]) is

$$H_+(M) = H_{BFGSg} + \left( HS (S^T H S)^{-1} - Y (Y^T S)^{-1} \right) M \left( HS (S^T H S)^{-1} - Y (Y^T S)^{-1} \right)^T$$

where $M \in R^{p \times p}$ is any symmetric matrix; $H_+(M)$ is positive definite if $M$ is positive definite, and $H_{DFPg} = H_+(S^T H S)$. Also, the Cholesky factorization of $H_{DFPg}$ or $H_{BFGSg}$ may be obtained in $O(n^2 p)$ operations from the Cholesky factorization of $H$. For example if $H = L L^T$ then $H_{BFGSg} = J J^T$ where

$$J = L + (Y G - H S)(S^T H S)^{-1} S^T L$$

for $G \in R^{p \times p}$ satisfying

$$G^T (Y^T S)\, G = S^T H S.$$

$J$ can be calculated in $O(n^2 p)$ operations, and its LQ factorization can be obtained in an additional $O(n^2 p)$ operations.


3.3. Projected symmetric secant updates

Davidon [1975] proposed a quasi-Newton algorithm that finds the minimizer of a positive definite quadratic in at most $n + 1$ iterations. To accomplish this, each quadratic model interpolates the gradients at current and all past iterates; that is, for each $k$,

$$\nabla m_k(x_j) = \nabla f(x_j), \qquad j = 0, \ldots, k. \qquad (3.16)$$

Equation (3.16) implies that

$$H_k S_{k-1} = Y_{k-1}$$

where $S_{k-1}, Y_{k-1} \in R^{n \times k}$ are defined by

$$S_{k-1} e_j = x_k - x_{k-j}, \qquad Y_{k-1} e_j = \nabla f(x_k) - \nabla f(x_{k-j}), \qquad j = 1, \ldots, k.$$

Similarly at the next iteration, Davidon's method requires

$$H_{k+1} S_k = Y_k$$

where $S_k, Y_k \in R^{n \times (k+1)}$,

$$S_k e_j = x_{k+1} - x_{k+1-j}, \qquad Y_k e_j = \nabla f(x_{k+1}) - \nabla f(x_{k+1-j}), \qquad j = 1, \ldots, k+1.$$

It is straightforward from the above definitions that

$$Y_k - H_k S_k = (y_k - H_k s_k)\, e^T, \qquad (3.17)$$

where $s_k = x_{k+1} - x_k$, $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$, and $e = (1, \ldots, 1)^T$; that is, $H_k$ already satisfies $k$ of the $k+1$ secant equations required of $H_{k+1}$.

Symmetric secant updates that satisfy $H_+ S = Y$ when $f(x)$ is quadratic and (3.17) is true were investigated by Davidon [1975], and subsequently by many authors including Dennis and Schnabel, Nazareth, and Schnabel [1977, 1978]. They often are called "projected secant updates."

Corollary 3.3 shows that the necessary and sufficient conditions for symmetric secant updates to satisfy $H_+ S = Y$ for general $f(x)$ follow immediately from Theorem 3.1. If these conditions are satisfied, the updates discussed in Section 3.2 reduce to rank two updates.

Corollary 3.3. Let $p \le n$, $H \in R^{n \times n}$ symmetric, $S, Y \in R^{n \times p}$, $\mathrm{rank}(S) = p$, $s = S e_1$, $y = Y e_1$, $e = (1, \ldots, 1)^T \in R^p$, and

$$Y - HS = (y - Hs)\, e^T. \qquad (3.18)$$

Then there exist symmetric $H_+$ for which $H_+ S = Y$ if and only if

$$S^T (y - Hs) = \alpha\, e, \qquad (3.19)$$

where $\alpha = s^T (y - Hs)$. In this case, the generalized PSB update (3.6) is a rank two update of $H$. If in addition $H$ is positive definite, there exist symmetric and positive definite $H_+$ for which $H_+ S = Y$ if and only if (3.19) is satisfied and

$$1 + \alpha \tau > 0, \qquad (3.20)$$

where $\tau = e^T (S^T H S)^{-1} e$. In this case, both the generalized DFP update (3.7) and the generalized BFGS update (3.10) are positive definite and rank two updates of $H$.

Proof: Define $Y$ by (3.18). From Theorem 3.1, there exists a symmetric $H_+$ satisfying $H_+ S = Y$ if and only if $Y^T S$ is symmetric. From (3.18),

$$Y^T S = S^T H S + e\, (y - Hs)^T S. \qquad (3.21)$$

Since $H$ is symmetric, $Y^T S$ is symmetric if and only if $S^T (y - Hs)$ is a scalar multiple of $e$. Since $e_1^T S^T (y - Hs) = s^T (y - Hs) = \alpha$, this is possible if and only if (3.19) is true. If (3.19) is satisfied then the generalized PSB update is symmetric, and substituting (3.18) and (3.19) into (3.6) shows that in this case it is the rank two update

$$H_{PSBg} = H + (y - Hs)\, \hat{s}^T + \hat{s}\, (y - Hs)^T - \alpha\, \hat{s} \hat{s}^T$$

where $\hat{s} = S (S^T S)^{-1} e$.

Also from Theorem 3.1, there exist symmetric and positive definite $H_+$ for which $H_+ S = Y$ if and only if $Y^T S$ is symmetric and positive definite. If $H$ is symmetric and positive definite and (3.19) is satisfied, then from (3.21), $Y^T S = S^T H S + \alpha\, e e^T$ is symmetric. Since $S^T H S$ is positive definite, $Y^T S$ is positive definite if and only if $1 + \alpha\, e^T (S^T H S)^{-1} e > 0$, which is (3.20). Substituting (3.18) and (3.19) into (3.7) shows that the generalized DFP update in this case is

$$H_{DFPg} = H + (y - Hs)\, \hat{y}^T + \hat{y}\, (y - Hs)^T - \alpha\, \hat{y} \hat{y}^T$$

where $\hat{y} = Y (Y^T S)^{-1} e$. Substituting (3.18) and (3.19) into (3.10) and using the Sherman-Morrison-Woodbury formula for the inverse of (3.21) gives that the generalized BFGS update in this case is

$$H_{BFGSg} = H + \frac{\tilde{y} \tilde{y}^T}{\tilde{y}^T \tilde{s}} - \frac{H \tilde{s} \tilde{s}^T H}{\tilde{s}^T H \tilde{s}}$$

where $\tilde{s} = S (S^T H S)^{-1} e$, $\tilde{y} = Y (S^T H S)^{-1} e$. $\blacksquare$


The necessary and sufficient conditions for projected symmetric secant updates to interpolate several past gradients have been discussed by several authors, starting with Schnabel [1977]. As we already have indicated, they rarely are satisfied if $f(x)$ is nonquadratic, even if (3.17) is true. In our opinion this is the fundamental reason why projected symmetric secant updates have not been an improvement over the BFGS in practice. One of the projected updates above was proposed by Schnabel [1977], and an algorithm that uses it was shown to be q-superlinearly convergent. If $f(x)$ is quadratic, it is the dual of the update originally proposed by Davidon [1975]. The other projected update is derived by Dennis and Schnabel.


3.4. Superlinear convergence of quasi-Newton methods using symmetric multiple secant updates

A local method for unconstrained minimization based on the symmetric multiple secant updates discussed in Section 3.2 is to select each iterate $x_{k+1}$ to be the critical point of the current quadratic model,

$$x_{k+1} = x_k - H_k^{-1} \nabla f(x_k),$$

then choose $p_k$, $S_k, Y_k \in R^{n \times p_k}$ such that $Y_k^T S_k$ is symmetric, and update $H_k$ by the generalized PSB (3.6), or if $Y_k^T S_k$ also is positive definite, by the generalized DFP (3.7) or BFGS (3.10) update. (When we refer to updates 3.6, 3.7 or 3.10 in this section, we assume that the symbols $H_{PSBg}$, $H_{DFPg}$, and $H_{BFGSg}$ in these formulas have been converted to $H_{k+1}$, and that all other symbols in these formulas have been given the subscript $k$.) In this section we show that if $\{S_k\}$ and $\{Y_k\}$ obey the same conditions (2.13-14) as were required for the local convergence of the multiple secant method for nonlinear equations, then any of the aforementioned methods for unconstrained minimization is locally and q-superlinearly convergent to a minimizer $x_*$ of $f(x)$, under standard assumptions. Theorem 3.4 proves the local q-superlinear convergence of the method that uses the generalized PSB update. The proof is based on Broyden, Dennis, and Moré [1973] and Dennis and Moré [1974], and is very similar to the proof of Theorem 2.5. Theorem 3.5 states the analogous result for methods using the generalized DFP, or BFGS, updates. The proofs would follow from the proof for the PSB method. Since these proof techniques are so well established, we omit the proof of Theorem 3.5 and just make a few comments about it.

Theorem 3.4. Let $F : R^n \to R^n$ be continuously differentiable in an open convex set $D$, and assume there exists $x_* \in D$, $\eta > 0$, and $\gamma \ge 0$ satisfying $N(x_*, \eta) \subset D$, $F(x_*) = 0$, $F'(x_*)$ is symmetric and nonsingular, and $\| F'(x) - F'(x_*) \| \le \gamma \| x - x_* \|$ for all $x \in N(x_*, \eta)$. Consider the sequences $\{x_k\}$ and $\{H_k\}$ generated from $x_0 \in R^n$ and a symmetric $H_0 \in R^{n \times n}$ by

$$x_{k+1} = x_k - H_k^{-1} F(x_k)$$

and the generalized PSB update (3.6), where $\{S_k\}$, $\{Y_k\} \in R^{n \times p_k}$ with each $p_k \in [1, n]$ and each $Y_k^T S_k$ symmetric. Suppose there exist $c_1 > 0$, $c_2 > 1$, $q \ge 0$, such that for $k = 0, 1, \ldots$,

$$\| Y_k - F'(x_*) S_k \|_F \le c_1\, \| S_k \|_F\, \max \{ \| x_{k-i} - x_* \|,\ i = -1, 0, \ldots, q_k \} \qquad (3.24)$$

and

$$\kappa(S_k) \le c_2, \qquad (3.25)$$

where each $q_k \le \max\{k, q\}$. Then there exist $\varepsilon > 0$, $\delta > 0$ such that if $\| x_0 - x_* \| \le \varepsilon$ and $\| H_0 - F'(x_*) \|_F \le \delta$, the sequence $\{x_k\}$ is well-defined and converges q-linearly to $x_*$, and $\{\| H_k \|\}$, $\{\| H_k^{-1} \|\}$ are uniformly bounded. If in addition, for each $k$ there exists $v_k \in R^{p_k}$ for which

$$S_k v_k = x_{k+1} - x_k,$$

then the rate of convergence is q-superlinear.

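Proof: Let $H_* = F'(x_*)$, $E_k = (H_k - H_*)$, $P_k = S_k (S_k^T S_k)^{-1} S_k^T$. Then from (3.6) it is straightforward to obtain

$$E_{k+1} = E_k + (Y_k - H_* S_k)(S_k^T S_k)^{-1} S_k^T - E_k P_k + S_k (S_k^T S_k)^{-1} (Y_k - H_* S_k)^T - P_k E_k + P_k E_k P_k - S_k (S_k^T S_k)^{-1} (Y_k - H_* S_k)^T P_k$$

$$= (I - P_k)\, E_k\, (I - P_k) + (Y_k - H_* S_k)(S_k^T S_k)^{-1} S_k^T + S_k (S_k^T S_k)^{-1} (Y_k - H_* S_k)^T (I - P_k). \qquad (3.26)$$

Thus using $\| I - P_k \|_2 \le 1$, (3.24), and the definition (2.11) of $\mu_k$,

$$\| E_{k+1} \|_F \le \| E_k \|_F\, \| I - P_k \|_2^2 + \| (Y_k - H_* S_k)(S_k^T S_k)^{-1} S_k^T \|_F\, (1 + \| I - P_k \|_2) \le \| E_k \|_F + 2\, c_1\, \kappa(S_k)\, \mu_k.$$

Therefore from (3.25), $E_{k+1}$ satisfies (2.10) with $\alpha_1 = 0$ and $\alpha_2 = 2 c_1 c_2$, which proves q-linear convergence from Theorem 2.3. To prove q-superlinear convergence, derive from (3.26)

$$\| E_{k+1} \|_F \le \| E_k (I - P_k) \|_F + 2\, c_1 c_2\, \mu_k.$$

The remainder of the q-superlinear proof then is identical to the q-superlinear proof in Theorem 2.5. $\blacksquare$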

Theorem 3.5. Let the assumptions of Theorem 3.4 hold, and assume in addition that $F'(x_*)$ is positive definite. Then Theorem 3.4 remains true if the generalized PSB update (3.6) is replaced by the generalized DFP update (3.7), or by the generalized BFGS update (3.10).

The convergence proof for the generalized DFP method is very similar to the proof of Theorem 3.4. The modifications required are similar to the modifications Broyden, Dennis, and Moré [1973] use to convert their proof for the PSB method into a proof for the DFP method. Bounded deterioration is proven using the weighted Frobenius norm

$$\| \bar{E}_k \|_F = \| H_*^{-1/2} (H_k - H_*) H_*^{-1/2} \|_F.$$

It is straightforward to show from (3.7) that

$$\bar{E}_{k+1} = (I - \bar{P}_k)\, \bar{E}_k\, (I - \bar{P}_k)^T + (\bar{Y}_k - \bar{S}_k)(\bar{Y}_k^T \bar{S}_k)^{-1} \bar{Y}_k^T + \bar{Y}_k (\bar{Y}_k^T \bar{S}_k)^{-1} (\bar{Y}_k - \bar{S}_k)^T (I - \bar{P}_k)^T$$

where

$$\bar{Y}_k = H_*^{-1/2} Y_k, \qquad \bar{S}_k = H_*^{1/2} S_k, \qquad \bar{P}_k = \bar{Y}_k (\bar{Y}_k^T \bar{S}_k)^{-1} \bar{S}_k^T,$$

and from (3.24),

$$\| I - \bar{P}_k \| \le 1 + O(\mu_k), \qquad \| \bar{P}_k \| \le 1 + O(\mu_k).$$

Linear convergence follows easily from these relations and Theorem 2.3, and q-superlinear convergence from the same techniques used in the proof of Theorem 3.4. The convergence proof for the generalized BFGS method is essentially the dual of the DFP proof, as in Broyden, Dennis, and Moré. Note that $Y_k^T S_k$ positive definite is implied by (3.24) and $F'(x_*)$ positive definite.

The crucial question is whether there exist reasonable choices of $\{S_k\}$ and $\{Y_k\}$ that satisfy the conditions of Theorems 3.4 and 3.5. The following section provides a positive answer to this question.


3.5. Forming multiple secant equations for unconstrained optimization

The obvious use of multiple secant equations in an unconstrained minimization algorithm would be to allow the quadratic model (3.1) of $f(x)$ around $x_{k+1}$ to interpolate gradients at $p_k > 1$ past iterates $x_{l_j}$, $j = 1, \ldots, p_k$, where

$$k = l_1 > l_2 > \cdots > l_{p_k} \ge 0. \qquad (3.27)$$

This would require the model Hessian $H_{k+1}$ to satisfy $p_k$ secant equations

$$H_{k+1} S_k = Y_k \qquad (3.28)$$

where $S_k, Y_k \in R^{n \times p_k}$ are defined by (3.5). Unfortunately, Theorem 3.1 shows that (3.28) is consistent with $H_{k+1}$ symmetric (and positive definite) only if $Y_k^T S_k$ is symmetric (and positive definite), and Example 3.1 indicates that this is unlikely for nonquadratic $f(x)$. In this section we discuss several ways to perturb $Y_k$ to

$$\bar{Y}_k = Y_k + \Delta Y_k$$

so that $\bar{Y}_k^T S_k$ is symmetric (and positive definite). These methods all yield $(\Delta Y_k) e_1 = 0$, that is, the standard secant equation is unchanged, and they all generate sequences $\{S_k\}$ and $\{\bar{Y}_k\}$ that satisfy the conditions of Theorems 3.4-5 for local q-superlinear convergence. The general aim of these methods is to perturb $Y_k$ as little as possible consistent with $\bar{Y}_k^T S_k$ symmetric, and to change more recent secant equations less than less recent secant equations.

For the remainder of this section, we assume that $S_k$ and $Y_k$ are defined by (3.5, 3.27), with $\{l_j\}$ chosen by a procedure that guarantees $\kappa(S_k)$ sufficiently small; a suitable procedure is described at the end of Section 2. We also drop the subscripts $k$ for the remainder of this section. Now we describe our first strategy for calculating $\Delta Y$.

It is trivial to calculate the lower triangular matrix $L \in R^{p \times p}$ for which

$$Y^T S - S^T Y = -L + L^T. \qquad (3.29)$$

Note that the diagonal of $L$ is zero. From (3.29), $(Y^T S + L)$ is symmetric. Our first strategy is to choose $\Delta Y$ such that

$$(\Delta Y)^T S = L. \qquad (3.30)$$

Equation (3.30) implies that for each column $(\Delta Y) e_j$ of $\Delta Y$, only $((\Delta Y) e_j)^T (S e_i)$, $i < j$, need be nonzero. Thus we may choose $(\Delta Y) e_1 = 0$, leaving the standard secant equation intact. This choice is guaranteed if we choose the smallest $\Delta Y$ that satisfies (3.30), in the Frobenius norm. From Theorem 2.1, it is

$$\Delta Y = S (S^T S)^{-1} L^T. \qquad (3.31)$$

The above choice of $\Delta Y$ guarantees that $(Y + \Delta Y)^T S$ is symmetric, but not necessarily that it is positive definite. An easy modification that assures positive definiteness is to first choose a subset of the rows and columns of $(Y^T S + L)$ that is positive definite, and restrict the past points used to this subset. This selection is easily accomplished using a modification of the Cholesky decomposition of $(Y^T S + L)$: the normal decomposition is attempted, but if the addition of the $j$-th row and column would cause the matrix not to be positive definite, then the $j$-th past point (and the $j$-th row and column of $(Y^T S + L)$) is eliminated. If the normal line search condition in a quasi-Newton method for minimization, $(Y e_1)^T (S e_1) > 0$, is satisfied, then this strategy always retains the current secant equation.
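The selection procedure just described can be sketched as a row-by-row Cholesky factorization that skips any index whose pivot would be nonpositive. The code below is one possible reading of that description, not the authors' implementation; the function name and the floating-point test `d > 0` (a practical code would likely use a small positive tolerance) are assumptions.

```python
import math

def select_positive_definite_subset(A):
    """Return the indices j kept so that the principal submatrix of the
    symmetric matrix A on the kept indices is positive definite."""
    kept = []    # indices of the retained past points
    rows = []    # rows of the lower triangular Cholesky factor on the kept indices
    for j in range(len(A)):
        # Forward substitution for the candidate row of the factor.
        new_row = []
        for i, idx in enumerate(kept):
            dot = sum(new_row[t] * rows[i][t] for t in range(i))
            new_row.append((A[j][idx] - dot) / rows[i][i])
        d = A[j][j] - sum(v * v for v in new_row)   # squared candidate pivot
        if d > 0:
            # Pivot stays positive: accept point j and extend the factor.
            new_row.append(math.sqrt(d))
            rows.append(new_row)
            kept.append(j)
        # Otherwise point j (its row and column) is dropped.
    return kept

both_kept = select_positive_definite_subset([[2.0, 4.0], [4.0, 21.0]])
second_dropped = select_positive_definite_subset([[2.0, 4.0], [4.0, 1.0]])
```

Because the first pivot is just the (1,1) entry, the most recent point is always kept when that entry is positive, which matches the remark about the line search condition.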

In Example 3.2 we apply this strategy to Example 3.1.


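Example 3.2. Let $S$, $Y$ be the matrices $S_1$ and $Y_1$ from Example 3.1. Then

$$Y^T S - S^T Y = \begin{bmatrix} 0 & -6 \\ 6 & 0 \end{bmatrix}, \qquad L = \begin{bmatrix} 0 & 0 \\ -6 & 0 \end{bmatrix}.$$

Since

$$(Y^T S + L) = \begin{bmatrix} 2 & 4 \\ 4 & 21 \end{bmatrix}$$

is positive definite, both past points can be retained. From (3.31),

$$\Delta Y = S (S^T S)^{-1} L^T = \begin{bmatrix} 0 & 12 \\ 0 & -6 \end{bmatrix}$$

so that

$$\bar{Y} = \begin{bmatrix} 0 & 13 \\ 2 & 4 \end{bmatrix}.$$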
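The numbers in Example 3.2 can be reproduced mechanically. The sketch below assumes $S = [[0,1],[1,2]]$ and $Y = [[0,1],[2,10]]$ from Example 3.1 and applies the first strategy: form $L$ from $Y^T S - S^T Y = -L + L^T$, then take $\Delta Y = S (S^T S)^{-1} L^T$. Helper names are illustrative.

```python
from fractions import Fraction

def mat(rows):
    return [[Fraction(x) for x in row] for row in rows]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Gauss-Jordan inverse of a small nonsingular square matrix."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

S = mat([[0, 1], [1, 2]])
Y = mat([[0, 1], [2, 10]])
St = transpose(S)

# D = Y^T S - S^T Y is skew-symmetric and equals -L + L^T.
D = sub(matmul(transpose(Y), S), matmul(St, Y))
p = len(D)
# Strictly lower triangular L: below the diagonal, L_ij = -D_ij.
L = [[-D[i][j] if i > j else Fraction(0) for j in range(p)] for i in range(p)]

dY = matmul(matmul(S, inverse(matmul(St, S))), transpose(L))   # formula (3.31)
Y_bar = add(Y, dY)
Y_bar_t_S = matmul(transpose(Y_bar), S)                        # should equal Y^T S + L
```

The first column of `dY` is zero, so the most recent secant equation is untouched, and `Y_bar_t_S` comes out symmetric.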


It is easy to show that under the assumptions of Theorem 3.4, there exists $c > 0$ for which $\| \bar{Y} - F'(x_*) S \|_F \le c\, \| S \|_F\, \mu$, $\mu$ defined by (2.11). Let $H_* = F'(x_*)$. We showed in the proof of Corollary 2.6 that

$$\| Y - H_* S \|_F \le \sqrt{n}\, \gamma\, \| S \|_F\, \mu,$$

so that

$$\| Y^T S - S^T H_* S \|_F \le \sqrt{n}\, \gamma\, \| S \|_F^2\, \mu.$$

Therefore

$$\| L \|_F = (1/\sqrt{2})\, \| -L + L^T \|_F = (1/\sqrt{2})\, \| Y^T S - S^T Y \|_F = (1/\sqrt{2})\, \| (Y^T S - S^T H_* S) - (S^T Y - S^T H_* S) \|_F \le \sqrt{2}\, \sqrt{n}\, \gamma\, \| S \|_F^2\, \mu$$

and

$$\| \Delta Y \|_F \le \| L \|_F\, \| (S^T S)^{-1} S^T \| \le \sqrt{2}\, \sqrt{n}\, \gamma\, \kappa(S)\, \| S \|_F\, \mu.$$

Thus

$$\| \bar{Y} - H_* S \|_F \le \| Y - H_* S \|_F + \| \Delta Y \|_F \le c\, \| S \|_F\, \mu \qquad (3.32)$$

where $c = \sqrt{n}\, \gamma\, (\sqrt{2}\, \kappa(S) + 1)$. From Theorems 3.4-5, this implies that a generalized PSB, DFP, or BFGS algorithm that chooses $\{S_k\}$ and $\{Y_k\}$ to satisfy the conditions of Corollary 2.6, and modifies $Y_k$ by (3.31), will be locally q-superlinearly convergent to a minimizer $x_*$ where $\nabla^2 f(x_*)$ is nonsingular. Sufficiently close to $x_*$, (3.32) guarantees that $\bar{Y}^T S$ will be positive definite.

When using multiple secant equations in conjunction with a generalized DFP or BFGS update, it may be more reasonable to find the smallest $\Delta Y$, in a weighted Frobenius norm, that satisfies (3.30). It is straightforward to show that, for $W \in R^{n \times n}$ nonsingular, the solution to

$$\underset{\Delta Y \in R^{n \times p}}{\text{minimize}}\ \| W^{-T} \Delta Y \|_F \quad \text{subject to } (\Delta Y)^T S = L$$

is

$$\Delta Y = W^T W S\, (S^T W^T W S)^{-1} L^T.$$

If we assume that the past points have been restricted, if necessary, so that $(Y^T S + L)$ is positive definite, then a reasonable choice is $W$ for which $W^T W S = (Y + \Delta Y)$. It is easy to show that this choice results in

$$\Delta Y = Y (S^T Y)^{-1} L^T.$$

$\bar{Y} = (Y + \Delta Y)$ also can be shown to satisfy the conditions of Theorems 3.4-5.

We briefly describe a second strategy for perturbing $Y$ that may come closer to the goal of changing recent information as little as possible. It is to change each column of $Y$ only as much as necessary to meet the symmetry requirements imposed by more recent (already revised) secant equations. Algebraically, this means

Algorithm 3.1.

1. Set $(\Delta Y) e_1 = 0$
2. For $j = 2, \ldots, p$ do

2.1. Select $\delta \in R^n$ to minimize $\| \delta \|_2$

subject to $(Y e_j + \delta)^T S e_i = (S e_j)^T (Y + \Delta Y) e_i$, $i = 1, \ldots, j-1$.

2.2. Set $(\Delta Y) e_j = \delta$

That is, column $j$ of $\Delta Y$ is chosen to be as small as possible subject to the $j$-th column of the $j \times j$ principal submatrix of $S^T (Y + \Delta Y)$ equaling the $j$-th row of this submatrix. The first two columns of $\Delta Y$ generated by Algorithm 3.1 are the same as are those generated by (3.31); the remaining columns would, in general, be different.
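Algorithm 3.1 can be sketched in a few lines. The code below follows one reading of step 2.1, namely that $\delta$ is the minimum 2-norm solution of $\delta^T S e_i = (S e_j)^T (Y + \Delta Y) e_i - (Y e_j)^T S e_i$ for $i < j$, which gives $\delta = S_{j-1} (S_{j-1}^T S_{j-1})^{-1} r$ with $S_{j-1}$ the first $j-1$ columns of $S$. The function name, the helper `dot`, and the 3 by 3 data are hypothetical.

```python
from fractions import Fraction

def mat(rows):
    return [[Fraction(x) for x in row] for row in rows]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inverse(A):
    """Gauss-Jordan inverse of a small nonsingular square matrix."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def algorithm_3_1(S, Y):
    """Return dY, built column by column, so that (Y + dY)^T S is symmetric."""
    n, p = len(S), len(S[0])
    s = transpose(S)                  # s[j] is column j of S
    y_bar = transpose(Y)              # y_bar[j] is the (revised) column j of Y
    dY_cols = [[Fraction(0)] * n]     # step 1: the first column is left unchanged
    for j in range(1, p):
        # Right-hand sides of the symmetry constraints against columns i < j.
        r = [[dot(s[j], y_bar[i]) - dot(y_bar[j], s[i])] for i in range(j)]
        Sj = transpose(s[:j])                         # n x j block of S
        G = inverse(matmul(transpose(Sj), Sj))        # (Sj^T Sj)^{-1}
        delta = [row[0] for row in matmul(matmul(Sj, G), r)]   # minimum-norm delta
        dY_cols.append(delta)
        y_bar[j] = [a + b for a, b in zip(y_bar[j], delta)]
    return transpose(dY_cols)

# Data of Example 3.1 (as assumed above) and a hypothetical 3 x 3 case.
S2, Y2 = mat([[0, 1], [1, 2]]), mat([[0, 1], [2, 10]])
dY2 = algorithm_3_1(S2, Y2)
S3 = mat([[1, 0, 1], [0, 1, 1], [0, 0, 1]])
Y3 = mat([[1, 2, 0], [0, 1, 3], [1, 0, 2]])
dY3 = algorithm_3_1(S3, Y3)
```

Under this reading, the second column of `dY2` for the Example 3.1 data is $(0, -6)^T$, which differs from the $(12, -6)^T$ that (3.31) gives in Example 3.2; the sketch may therefore not match the authors' intended formulation in every detail. In both test cases $(Y + \Delta Y)^T S$ is symmetric and the first column of $\Delta Y$ is zero.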

$S^T (Y + \Delta Y)$ generated by Algorithm 3.1 also might not be positive definite. It is easy to modify Algorithm 3.1, however, to generate $S^T (Y + \Delta Y)$ positive definite, by generating iteratively the Cholesky factorization of the current $j \times j$ principal submatrix, and, if the $j$-th step fails to keep the submatrix positive definite, eliminating this point from the set of past points used at that iteration.

Algorithm 3.1 has a close relationship to our first strategy for choosing $\Delta Y$. From step 2.1, $\Delta Y = S L^T$, where $L$ is lower triangular with zero diagonal. Thus $(Y + \Delta Y)^T S = (Y^T S + L S^T S)$ is symmetric, so Algorithm 3.1 is equivalent to finding the unique lower triangular (with zero diagonal) $L$ for which

$$Y^T S - S^T Y = -L S^T S + S^T S L^T,$$

and then choosing $\Delta Y$ to solve

$$\text{minimize } \| \Delta Y \|_F \quad \text{subject to } \Delta Y^T S = L S^T S.$$

$(Y + \Delta Y)$ generated by Algorithm 3.1 also obeys the conditions of Theorems 3.4-5, since it can be shown that

$$\| \Delta Y \|_F \le \sqrt{n}\, \gamma\, (1 + \kappa(S))^{p}\, \| S \|\, \mu.$$

Since $p$ and $\kappa(S)$ will be small in practice, the constant in the above equation is not too large. Finally, a weighted version of Algorithm 3.1 can be obtained by changing the norm in step 2.1 to a weighted norm.

The strategies given in this section may not be the best ways to generate multiple secant equations for minimization. They do show, however, that reasonable choices of $\{S_k\}$ and $\{Y_k\}$ exist that satisfy both the existence conditions of Theorem 3.1 and the local q-superlinear convergence conditions of Theorems 3.4-5. Maybe they will lead to successful computational algorithms. We do think there is a significant difference between the strategies of this section and algorithms that have used projected updates such as those discussed in Section 3.3. While both approaches interpolate multiple past gradients when $f(x)$ is quadratic, the strategies of this section should come closer to interpolating past gradients for non-quadratic functions, because they do not compound the interpolation errors of previous updates. The cost is a higher rank update.